Reliability Society Newsletter - May 2013 Feature Article:

Failure of Successful Reliability Demonstration Tests

Guangbin Yang

Chrysler, Auburn Hills, Michigan, U.S.A.

gbyang@ieee.org



Many manufacturers have developed effective reliability programs that are implemented throughout the product life cycle, which includes product planning; design and development; verification and validation; production; field deployment; and disposal. In the product planning phase, a reliability target is specified. Then various design-for-reliability techniques, such as failure mode and effects analysis (FMEA) and design of experiments, are applied in the design and development phase to achieve the reliability target at low cost and in a short time. Once a product design is completed, often supported by CAE analyses, prototypes are built for design verification (DV) testing. The purpose of DV is to verify that the design meets functional, environmental, reliability, and legal requirements. Successful DV tests allow the design to be released for pilot production, which is intended to build the products that customers will see in the market. Samples are drawn from these products and subjected to process validation (PV) testing. Such testing proves that the production process is capable of manufacturing products that meet the specified requirements.

Both DV and PV include a reliability demonstration test, which may be conducted as a degradation test, a life test, or a bogey test [1]. In the DV stage this test demonstrates that the design achieves the required reliability, while in the PV stage it validates that the process can manufacture final products meeting the reliability requirement. After the design passes all PV tests, the design team often celebrates the conclusion of the project and enjoys the success.

Then some products fail unexpectedly in the field, and the complacency vanishes as warranty claims pour in. The unexpected failures certainly perplex the design team, because the specified reliability had been proved out. A natural question is, "Are the reliability demonstration tests correlated to the field application?" Unfortunately, the correlation can be seriously compromised if the tests involve one or more of the following pitfalls.

1. Unrealistic test conditions. Many reliability demonstration tests apply only one test stress (e.g., vibration or temperature) with an oversimplified stress profile. Such a stress may not excite the failure mechanisms that will occur under the real-world usage profile. Some tests employ multiple stresses, but subject the samples to one stress at a time. This method misses any interactions between the stresses, which often shorten the product life. For example, if a plastic part is subjected to high temperature and vibration at the same time, the material softens, its natural frequency drops, and the part has a smaller chance of surviving the test. In the field, products experience numerous stresses that come simultaneously, not sequentially.
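
To see the effect numerically, consider a hypothetical log-linear life-stress model with a temperature-vibration interaction term; all coefficients below are invented for illustration and do not describe any real product.

    import math

    # Hypothetical log-linear life-stress model (coefficients invented):
    # ln(life, hours) = b0 + bT*T + bV*V + bTV*T*V
    b0, bT, bV, bTV = 10.0, -0.03, -0.50, -0.005

    def life_hours(T, V, interaction=True):
        """Predicted life at temperature T (deg C) and vibration V (Grms)."""
        log_life = b0 + bT * T + bV * V + (bTV * T * V if interaction else 0.0)
        return math.exp(log_life)

    T, V = 85.0, 3.0
    print(f"Stresses applied sequentially (no interaction): {life_hours(T, V, False):.0f} h")
    print(f"Stresses applied simultaneously:                {life_hours(T, V, True):.0f} h")

With the interaction active, the predicted life in this sketch drops by more than a factor of three; a sequential single-stress test would never reveal it.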

2. Unrepresentative samples. In the DV stage, the test units are not samples, although they are often so called. Rather, they are prototypes, specially built with fine workmanship and with selected materials and components that have little variation. Even the samples for PV tests are not fully representative of the final products that customers will see in the market: they are manufactured in a pilot production run and thus contain little lot-to-lot variation. Needless to say, the demonstrated reliability is overly optimistic.

3. Insufficient sample size. Samples are scarce and expensive, especially in the DV stage. A sample size of statistical significance is usually unaffordable in a competitive market. Many tests use an economical sample size, which is sometimes too small to give each important failure mode a chance to manifest itself. For example, a sample of size six is insufficient for a product that has more than six critical failure modes: even if all six samples fail, each in a different mode, the remaining critical failure modes will escape the test and will likely occur in the field. A small sample size often yields favorable conclusions, especially when the samples are unrepresentative as described above.
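
A quick calculation shows how weak a small sample is. If a failure mode occurs independently in each unit with probability p (an illustrative assumption; p = 0.05 below is invented), the chance that the mode appears at least once among n units is 1 - (1 - p)^n:

    # Probability that a failure mode with per-unit occurrence probability p
    # appears at least once among n test units (independence assumed).
    def detection_prob(p, n):
        return 1.0 - (1.0 - p) ** n

    for n in (6, 22, 59):
        print(f"n = {n:2d}: P(detect a p = 0.05 mode) = {detection_prob(0.05, n):.2f}")

Six units detect a 5% failure mode only about a quarter of the time; roughly 59 units are needed for a 95% chance.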

4. Unfair trade between sample size and test time. In some applications, a longer test time is traded for a smaller sample size, or the other way around. The trade can be made statistically fair using, for example, the Weibull bogey test equations [1]. However, a smaller sample size reduces the chance of catching infant mortality problems [2]; therefore, such a trade is not suitable for PV testing. On the other hand, a shorter test time may miss failure mechanisms that develop only after a long time in service.
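
One common form of the zero-failure Weibull bogey relation (discussed in detail in [1]) is n = ln(1 - C) / [m^beta * ln R], where m is the ratio of test time to bogey time, beta the Weibull shape parameter, R the reliability to be demonstrated, and C the confidence level. The sketch below, with illustrative values, shows how quickly extending the test shrinks the sample size:

    import math

    def required_n(R, C, beta, m):
        """Zero-failure Weibull bogey test: units needed when each unit
        is tested to m times the bogey time (one common form of the
        relation; the R, C, beta values below are illustrative)."""
        return math.ceil(math.log(1.0 - C) / (m ** beta * math.log(R)))

    R, C, beta = 0.90, 0.90, 2.0
    for m in (1.0, 1.5, 2.0):
        print(f"test to {m:.1f}x bogey: n = {required_n(R, C, beta, m)}")

Doubling the test time cuts the required sample from 22 units to 6. The trade is statistically fair, but six units say little about infant mortality or lot-to-lot variation.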

5. Lack of understanding of failure physics. To reduce test time, most reliability demonstration tests are conducted under accelerated conditions. The test data are then extrapolated, through an acceleration model, to estimate the reliability at the use conditions. The extrapolation is dangerous if the failure mechanisms are not well understood: empirical models, such as the popular inverse power relationship, can be seriously in error outside the test range. Some accelerated tests compress test time by raising the usage rate, increasing the operating speed or reducing the off time. It is often neglected that a higher usage rate can generate more heat, which is not accounted for in the acceleration model and thus shortens the cycles to failure [1].
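
The inverse power relationship L(S) = A / S^n illustrates the danger. Suppose a test at stress S_test yields a life estimate, and the use stress S_use is a decade lower (all numbers below are invented); a modest error in the fitted exponent n is then magnified by the extrapolation:

    # Inverse power relationship L(S) = A / S**n: life observed at the
    # test stress, extrapolated down to the use stress (values invented).
    S_test, S_use = 100.0, 10.0
    test_life = 1.0e3  # hours observed at S_test

    for n in (2.0, 2.2):  # a modest uncertainty in the fitted exponent
        use_life = test_life * (S_test / S_use) ** n
        print(f"n = {n}: extrapolated use life = {use_life:.3g} h")

A 10% error in n changes the predicted use life by nearly 60% here; the farther the extrapolation, the worse the magnification.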

6. Improper use of engineering standards. Some manufacturers mistakenly believe that their products achieve the required reliability once the samples pass the tests in certain standards, such as MIL and SAE standards. Often, the test stresses specified in a standard are not highly correlated to the real-world usage of the particular product, and the test time is too short. Passing such tests does not demonstrate the reliability at the design life, although it provides a certain level of confidence in the reliability.

7. Misspecification of life scale. Some products, such as automobiles, have more than one life scale, typically usage and age. In many cases, not all scales are closely related to the underlying failure mechanisms, and using an inappropriate life scale leads to an erroneous specification of reliability and a meaningless reliability demonstration. On the other hand, the failure of some products is governed by two or more failure mechanisms, and a single life scale cannot fully characterize the field reliability. For instance, automotive catalytic converters fail due to thermal stress and chemical contamination: thermal failure is often related to age, whereas chemical contaminants accumulate with usage.
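
One way to sketch a two-scale model, assuming the two mechanisms act as independent competing risks on their own scales (an illustrative assumption, not the article's method), is a Weibull in age for thermal failure and a Weibull in mileage for contamination, with all parameters invented:

    import math

    def weibull_R(x, eta, beta):
        """Weibull reliability at life x with scale eta and shape beta."""
        return math.exp(-((x / eta) ** beta))

    def converter_R(age_years, mileage):
        # Independent competing risks on two life scales (parameters invented).
        thermal = weibull_R(age_years, eta=15.0, beta=2.5)   # age-driven
        chemical = weibull_R(mileage, eta=2.0e5, beta=1.8)   # usage-driven
        return thermal * chemical

    print(f"R(10 yr, 120,000 mi) = {converter_R(10.0, 1.2e5):.2f}")

A demonstration specified on mileage alone would miss the age-driven term entirely.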

8. Overconfidence in design changes. Many designs cannot pass the DV and PV tests at the first attempt. The design team identifies the root causes of failure and makes design changes, and is often confident that the changes have eradicated the failure mode. When the design release schedule is pressing, management sometimes makes the audacious decision to sign a deviation, releasing the design without the needed follow-up DV or PV test. The decision poses a great risk even when it is backed up by CAE analyses, which can be misleading when the CAE models are inadequate.

Root cause analyses of field failures often result in design improvements, corrections to the production process, or both. However, the fact that the reliability demonstration tests failed to detect the inadequate design or process is usually overlooked; as a result, test improvement is often not part of the remedial action. This should be corrected by taking at least the following actions.

1. Revisit, and perhaps revise, the engineering specifications concerning the test conditions, sample size, and test time. Analyses of warranty return parts reveal the root causes of field failure; the test conditions must excite these failure mechanisms, and the test time should be correlated to the design life or another specified time of interest. For products subject to degradation failure, the test time may be reduced without sacrificing confidence in the test results by utilizing performance degradation measurements [3]. The sample size should not be smaller than the number of important failure modes identified in the FMEA. This minimum sample size consideration is in line with the U.S. Senate model, which requires two seats from each state so that each of the two possible votes (Yea and Nay) has a chance to be cast.

2. The reliability demonstration test should use the degradation test method whenever suitable. This test method yields more information and requires a shorter test time than the life test method [4]. The bogey test method, although common in the automotive industry and some other sectors, is not recommended because it does not produce failure data and requires a large sample size.
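
The idea behind the degradation approach can be sketched simply (linear degradation paths and an invented failure threshold; this is an illustration, not the method of [3] or [4]): fit each unit's measured degradation trend and extrapolate it to the threshold to obtain pseudo failure times, gaining life information before any unit actually fails.

    # Sketch: extrapolating linear degradation paths to pseudo failure times.
    THRESHOLD = 10.0  # degradation level defined as failure (invented)

    units = {  # unit: [(hours, measured wear), ...] -- made-up readings
        "A": [(0, 0.0), (500, 1.8), (1000, 3.9)],
        "B": [(0, 0.0), (500, 2.6), (1000, 5.1)],
    }

    for name, pts in units.items():
        # Least-squares slope through the origin (wear starts at zero).
        slope = sum(t * w for t, w in pts) / sum(t * t for t, _ in pts)
        print(f"unit {name}: pseudo failure time = {THRESHOLD / slope:.0f} h")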

3. The process for building prototypes and samples should have process steps and parameters as close to those of final production as possible. Do not deliberately build parts for passing the tests; rather, parts from the weak tail of the population may be built deliberately for testing.

4. Avoid using CAE analysis as a surrogate for reliability demonstration testing. Even if a design or process change appears insignificant, a follow-up demonstration test should be conducted rather than relying merely on the CAE analysis. If the CAE analysis failed to detect the failure mode in the first place, why should it be trusted before the improvement is validated?

Reliability demonstration testing is usually considered a tool to prove out the achievement of reliability, especially by government contractors. Indeed, it is the last measure available to catch a defective design or process. For this reason, the above recommendations should be taken proactively, even for a design or process that appears to be free of defects.


REFERENCES

[1] Yang, G., Life Cycle Reliability Engineering, Wiley, Hoboken, 2007.
[2] Meeker, W. Q., "Book review: Life Cycle Reliability Engineering," Journal of Quality Technology, vol. 41, no. 2, pp. 345-348, 2008.
[3] Yang, G., "Reliability demonstration through degradation bogey testing," IEEE Transactions on Reliability, vol. 58, no. 4, pp. 604-610, 2009.
[4] Yang, G., "Accelerated degradation tests for rapid reliability evaluation," Tutorials of Annual Reliability and Maintainability Symposium, 2013.

